Audio-visual recognition of spectrally reduced speech
نویسنده
چکیده
Perceptual experiments on audio-visual consonant recognition based on the spectral reduction of the speech (SRS) have been carried out with coherent and incoherent (McGurk) audio-visual pairs. The main interest of SRS in four sub-bands is to have a partial suppression of the information transmitted for the place of articulation. The integration of manner, restricted to the fricative/occlusive contrast, is also concerned and a new 'crossmanner' combination is tested. As expected, we have a good audiovisual complementarity for SRS and a high amount of McGurk responses, but new interesting effects are observed. For the interpretation of human confusion about place of articulation, the Bayesian model proposed by Massaro and Stork [8] is compared to a new place identification model which is based on averaging as well as on the separate identification of articulatory features. This decomposition is a promising way for the development of multistream speech recognition models.
منابع مشابه
Investigating the impact of artificial enhancement of lip visibility on the intelligibility of spectrally-distorted speech
The intelligibility of visual speech can be affected by a number of facial visual signals, e.g. lip emphasis, teeth and tongue visibility, and facial hair. This paper focuses on lip visibility. In the study presented in this paper, we use spectrally-distorted speech to train groups of non-native, English-speaking Saudi listeners using three different forms of speech: audio-only, audiovisual, an...
متن کاملA comparison of audiovisual and auditory-only training on the perception of spectrally-distorted speech
Recent research suggests that using visual speech in auditory training can improve auditory-only speech perception. The long term aim of our work is to investigate this approach for hearing-impaired users, in particular cochlear-implant users. In the pilot study presented in this paper, we use spectrally-distorted speech to train two different groups of normal hearing subjects: native English a...
متن کاملRecognition of isolated words using Zernike and MFCC features for audio visual speech recognition
Automatic Speech Recognition (ASR) by machine is an attractive research topic in signal processing domain and has attracted many researchers to contribute in this area. In recent year, there have been many advances in automatic speech reading system with the inclusion of audio and visual speech features to recognize words under noisy conditions. The objective of audio-visual speech recognition ...
متن کاملAn audio-visual approach to simultaneous-speaker speech recognition
Audio-visual speech recognition is an area with great potential to help solve challenging problems in speech processing. Difficulties due to background noises are significantly reduced by the additional information provided by extra visual features. The presence of additional speech from other talkers during recording may be viewed as one of the most difficult sources of noise. This paper prese...
متن کاملLarge-vocabulary audio-visual speech recognition: a summary of the Johns Hopkins Summer 2000 Workshop
We report a summary of the Johns Hopkins Summer 2000 Workshop on audio-visual automatic speech recognition (ASR) in the large-vocabulary, continuous speech domain. Two problems of audio-visual ASR were mainly addressed: Visual feature extraction and audio-visual information fusion. First, image transform and model-based visual features were considered, obtained by means of the discrete cosine t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001